Univariate Stock Predictions: LSTM, ARIMA, and Prophet

INFO 523 - Final Project

Project description
Author: Matt Osterhoudt

Affiliation: College of Information Science, University of Arizona

Abstract

Introduction/Question

Approach

Code & Visual Analysis

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 504 entries, 2022-12-29 00:00:00-05:00 to 2024-12-31 00:00:00-05:00
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Close   504 non-null    float64
dtypes: float64(1)
memory usage: 7.9 KB
p-value pre-difference: 0.851468991198204
p-value post-difference: 8.88101992266411e-28
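
The pre- and post-difference p-values above suggest the series was tested for stationarity before and after first differencing (the d = 1 in the ARIMA orders reported in this section). The following is a minimal sketch of what first differencing does to a trending series. It uses a hypothetical toy series rather than the project's data, and the helper names `difference` and `mean` are illustrative, not taken from the source.

```python
def difference(series, lag=1):
    """Return the lag-`lag` differences of a sequence of numbers."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# Hypothetical toy series with a steady upward drift (not the project's data).
prices = [100.0 + 2.0 * t for t in range(20)]
diffs = difference(prices)

# The level of the raw series shifts between the two halves ...
print(mean(prices[:10]), mean(prices[10:]))  # 109.0 129.0
# ... while the differenced series keeps the same mean throughout.
print(mean(diffs[:9]), mean(diffs[9:]))      # 2.0 2.0
```

A series whose mean drifts like `prices` is non-stationary, which is consistent with the large pre-difference p-values; removing the drift by differencing is consistent with the near-zero post-difference p-values.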
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                  Close   No. Observations:                 2012
Model:                 ARIMA(9, 1, 9)   Log Likelihood               -5023.264
Date:                Thu, 21 Aug 2025   AIC                          10086.528
Time:                        01:11:41   BIC                          10198.656
Sample:                             0   HQIC                         10127.687
                               - 2012                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
x1             0.0967      0.049      1.966      0.049       0.000       0.193
ar.L1          0.0495      0.074      0.665      0.506      -0.096       0.195
ar.L2         -0.1212      0.073     -1.666      0.096      -0.264       0.021
ar.L3          0.1209      0.070      1.729      0.084      -0.016       0.258
ar.L4         -0.1098      0.060     -1.824      0.068      -0.228       0.008
ar.L5         -0.0738      0.070     -1.061      0.289      -0.210       0.063
ar.L6         -0.0960      0.068     -1.403      0.160      -0.230       0.038
ar.L7          0.1091      0.067      1.621      0.105      -0.023       0.241
ar.L8          0.0780      0.059      1.323      0.186      -0.038       0.194
ar.L9          0.6292      0.066      9.584      0.000       0.501       0.758
ma.L1         -0.1468      0.078     -1.884      0.060      -0.299       0.006
ma.L2          0.1251      0.076      1.645      0.100      -0.024       0.274
ma.L3         -0.1691      0.072     -2.358      0.018      -0.310      -0.029
ma.L4          0.1244      0.065      1.902      0.057      -0.004       0.253
ma.L5          0.0770      0.073      1.052      0.293      -0.066       0.220
ma.L6          0.0191      0.072      0.265      0.791      -0.122       0.160
ma.L7         -0.0793      0.072     -1.108      0.268      -0.220       0.061
ma.L8         -0.1663      0.062     -2.664      0.008      -0.289      -0.044
ma.L9         -0.5046      0.072     -7.018      0.000      -0.646      -0.364
sigma2         8.6581      0.139     62.074      0.000       8.385       8.931
===================================================================================
Ljung-Box (L1) (Q):                   0.10   Jarque-Bera (JB):              4186.86
Prob(Q):                              0.75   Prob(JB):                         0.00
Heteroskedasticity (H):              44.77   Skew:                            -0.55
Prob(H) (two-sided):                  0.00   Kurtosis:                         9.98
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
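
The AIC, BIC, and HQIC in the summary above can be checked by hand from the log likelihood. This sketch uses the standard definitions and rests on two assumptions: k = 20 estimated parameters (x1, nine AR terms, nine MA terms, and sigma2), and an effective sample size of n = 2011, i.e. one observation lost to the first difference. The function name `information_criteria` is illustrative.

```python
import math


def information_criteria(log_likelihood, k, n):
    """Return (AIC, BIC, HQIC) for a model with k parameters fit on n observations."""
    aic = 2 * k - 2 * log_likelihood
    bic = k * math.log(n) - 2 * log_likelihood
    hqic = 2 * k * math.log(math.log(n)) - 2 * log_likelihood
    return aic, bic, hqic


# Values read from the ARIMA(9, 1, 9) summary; n = 2011 is an assumption
# (2012 observations minus one lost to differencing).
aic, bic, hqic = information_criteria(-5023.264, k=20, n=2011)
print(round(aic, 2), round(bic, 2), round(hqic, 2))
```

Under these assumptions the results agree with the reported 10086.528, 10198.656, and 10127.687 to within the rounding of the log likelihood. Because BIC penalizes each parameter by ln(n) rather than 2, it penalizes a high-order fit such as this one more heavily than AIC does.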

p-value pre-difference: 0.7797885823411574
p-value post-difference: 3.7230748933144776e-26
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                  Close   No. Observations:                 2012
Model:                 ARIMA(9, 1, 6)   Log Likelihood               -5248.549
Date:                Thu, 21 Aug 2025   AIC                          10531.097
Time:                        01:11:50   BIC                          10626.406
Sample:                             0   HQIC                         10566.083
                               - 2012                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
x1             0.0966      0.076      1.276      0.202      -0.052       0.245
ar.L1         -1.0289      0.190     -5.404      0.000      -1.402      -0.656
ar.L2         -0.1938      0.137     -1.412      0.158      -0.463       0.075
ar.L3          0.1363      0.099      1.377      0.169      -0.058       0.330
ar.L4         -0.6019      0.086     -7.016      0.000      -0.770      -0.434
ar.L5         -1.0544      0.166     -6.363      0.000      -1.379      -0.730
ar.L6         -0.6232      0.119     -5.242      0.000      -0.856      -0.390
ar.L7          0.0226      0.022      1.012      0.312      -0.021       0.066
ar.L8         -0.0203      0.027     -0.750      0.453      -0.074       0.033
ar.L9          0.0147      0.028      0.533      0.594      -0.039       0.069
ma.L1          0.9678      0.190      5.099      0.000       0.596       1.340
ma.L2          0.1483      0.129      1.147      0.251      -0.105       0.402
ma.L3         -0.1243      0.097     -1.280      0.200      -0.315       0.066
ma.L4          0.5801      0.082      7.092      0.000       0.420       0.740
ma.L5          1.0090      0.158      6.406      0.000       0.700       1.318
ma.L6          0.5316      0.107      4.959      0.000       0.321       0.742
sigma2        10.7659      0.182     59.162      0.000      10.409      11.123
===================================================================================
Ljung-Box (L1) (Q):                   0.02   Jarque-Bera (JB):              3797.29
Prob(Q):                              0.88   Prob(JB):                         0.00
Heteroskedasticity (H):               9.58   Skew:                            -0.78
Prob(H) (two-sided):                  0.00   Kurtosis:                         9.55
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

p-value pre-difference: 0.8880358688505654
p-value post-difference: 2.0730845211488237e-26
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                  Close   No. Observations:                 2012
Model:                 ARIMA(2, 1, 2)   Log Likelihood                2478.303
Date:                Thu, 21 Aug 2025   AIC                          -4944.607
Time:                        01:11:54   BIC                          -4910.968
Sample:                             0   HQIC                         -4932.259
                               - 2012                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
x1             0.0027      0.002      1.753      0.080      -0.000       0.006
ar.L1         -1.7993      0.014   -132.368      0.000      -1.826      -1.773
ar.L2         -0.9273      0.013    -70.667      0.000      -0.953      -0.902
ma.L1          1.7266      0.018     96.020      0.000       1.691       1.762
ma.L2          0.8382      0.017     47.923      0.000       0.804       0.873
sigma2         0.0050   6.48e-05     77.221      0.000       0.005       0.005
===================================================================================
Ljung-Box (L1) (Q):                   1.01   Jarque-Bera (JB):             10862.71
Prob(Q):                              0.31   Prob(JB):                         0.00
Heteroskedasticity (H):              11.53   Skew:                            -0.02
Prob(H) (two-sided):                  0.00   Kurtosis:                        14.39
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm (LSTM)                     │ (None, 50)             │        10,400 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 50)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 8)              │           408 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │             9 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 10,817 (42.25 KB)
 Trainable params: 10,817 (42.25 KB)
 Non-trainable params: 0 (0.00 B)
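
The parameter counts in the model summary above can be reproduced with a little arithmetic. This sketch assumes a standard LSTM layer with bias terms, one input feature per time step (consistent with a univariate series), and 4-byte float32 weights. The helper names `lstm_params` and `dense_params` are illustrative.

```python
def lstm_params(units, input_dim):
    """Four gates, each with input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


lstm = lstm_params(units=50, input_dim=1)   # input_dim=1 is an assumption
dense_8 = dense_params(inputs=50, units=8)
dense_1 = dense_params(inputs=8, units=1)
total = lstm + dense_8 + dense_1            # Dropout adds no parameters

print(lstm, dense_8, dense_1, total)        # 10400 408 9 10817
print(round(total * 4 / 1024, 2))           # 42.25 (KB at 4 bytes per weight)
```

The match with the reported 10,400 / 408 / 9 counts supports the assumption that the network sees a single feature per time step.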

Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm_1 (LSTM)                   │ (None, 50)             │        10,400 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout)             │ (None, 50)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 8)              │           408 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 1)              │             9 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 10,817 (42.25 KB)
 Trainable params: 10,817 (42.25 KB)
 Non-trainable params: 0 (0.00 B)

Model: "sequential_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm_2 (LSTM)                   │ (None, 50)             │        10,400 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_2 (Dropout)             │ (None, 50)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense)                 │ (None, 8)              │           408 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_5 (Dense)                 │ (None, 1)              │             9 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 10,817 (42.25 KB)
 Trainable params: 10,817 (42.25 KB)
 Non-trainable params: 0 (0.00 B)

Prophet MSFT:  MSE: 4269.369770815855 | MAE: 63.449096579809385 | R²: -11.99118901386011 | NMSE: 10.248100008460781 | NMAE: 0.15230179677599517
Prophet SWPPX:  MSE: 18.71373115155553 | MAE: 3.7204812362318984 | R²: -4.202185562404218 | NMSE: 1.6940116288494316 | NMAE: 0.3367868453410513
Prophet SPY:  MSE: 21542.44558570907 | MAE: 122.95392059079454 | R²: -4.212611793253368 | NMSE: 45.81756334096063 | NMAE: 0.2615046199037549
ARIMA MSFT:  MSE: 13688.300774052252 | MAE: 106.07123014448335 | R²: -2.4564387257887184 | NMSE: 37.720205983022815 | NMAE: 0.2922954949606996
ARIMA SWPPX:  MSE: 8.179969706222703 | MAE: 2.34659178870035 | R²: -1.025428315787289 | NMSE: 0.7305713644624706 | NMAE: 0.20957935377231218
ARIMA SPY:  MSE: 10106.699084182106 | MAE: 84.97058002767089 | R²: -1.2263694866717207 | NMSE: 21.283320465104367 | NMAE: 0.17893637376273652
LSTM MSFT:  MSE: 150.62202682849582 | MAE: 9.806085260902963 | R²: 0.9582695477558673 | NMSE: 0.041730452244132764 | NMAE: 0.026781565837953412
LSTM SWPPX:  MSE: 0.033592081720711535 | MAE: 0.13173709070779432 | R²: 0.9914872802849118 | NMSE: 0.008512719715088105 | NMAE: 0.011695382611037287
LSTM SPY:  MSE: 56.95006786867393 | MAE: 5.933259979496159 | R²: 0.9870693089375122 | NMSE: 0.012930691062487752 | NMAE: 0.012431404762889753
Stock   Model          MSE       MAE        R²     NMSE    NMAE
MSFT    Prophet   4591.333    65.959   -12.971   11.021   0.158
MSFT    ARIMA    13585.598   105.661    -2.431   37.437   0.291
MSFT    LSTM        63.961     6.459     0.982    0.018   0.018
SWPPX   Prophet     18.707     3.719    -4.200    1.693   0.337
SWPPX   ARIMA        8.303     2.365    -1.056    0.742   0.211
SWPPX   LSTM         0.033     0.132     0.992    0.008   0.012
SPY     Prophet  21337.151   122.298    -4.163   45.381   0.260
SPY     ARIMA    10072.653    84.816    -1.219   21.212   0.179
SPY     LSTM        63.426     6.268     0.986    0.014   0.013
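
For readers interpreting the table above, here is a minimal sketch of how the five metrics can be computed. The source does not state how NMSE and NMAE are normalized, so this sketch assumes NMSE = MSE / variance of the true values and NMAE = MAE / mean of the true values. Under that assumption NMSE equals 1 − R², which matches the LSTM rows but not the Prophet and ARIMA rows, so the author's normalization may differ. The function name and the toy data are hypothetical.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, R2, NMSE, and NMAE under the assumed normalizations."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    variance = sum((t - mean_true) ** 2 for t in y_true) / n
    return {
        "MSE": mse,
        "MAE": mae,
        "R2": 1 - mse / variance,
        "NMSE": mse / variance,        # assumed normalization: variance of y_true
        "NMAE": mae / abs(mean_true),  # assumed normalization: mean of y_true
    }


# Hypothetical toy values, not the project's data.
y_true = [10.0, 12.0, 14.0, 16.0]
good = regression_metrics(y_true, [11.0, 12.0, 13.0, 17.0])
baseline = regression_metrics(y_true, [13.0] * 4)  # always predict the mean
print(good)
print(baseline["R2"])  # 0.0
```

A constant forecast at the mean of the test data scores R² = 0, so the negative R² values reported for Prophet and ARIMA indicate forecasts whose squared error exceeds the variance of the test data itself.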

Discussion

Conclusion